Integrating sentence- and word-level error identification for disfluency correction
Abstract
While speaking spontaneously, speakers often make errors such as self-corrections or false starts, which interfere with the successful application of natural language processing techniques like summarization and machine translation to this data. There is active work on reconstructing this errorful data into a clean and fluent transcript by identifying and removing these simple errors. Previous research has approximated the potential benefit of conducting word-level reconstruction of simple errors only on those sentences known to have errors. In this work, we explore new approaches for automatically identifying speaker construction errors on the utterance level, and quantify the impact that this initial step has on word- and sentence-level reconstruction accuracy.
Similar resources
Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction
We demonstrate that an attention-based encoder-decoder model can be used for sentence-level grammatical error identification for the Automated Evaluation of Scientific Writing (AESW) Shared Task 2016. The attention-based encoder-decoder models can be used for the generation of corrections, in addition to error identification, which is of interest for certain end-user applications. We show that ...
應用不定長度特徵之條件隨機域於口語不流暢語流修正 (Disfluency Correction of Spontaneous Speech using Conditional Random Fields with Variable Length Features) [In Chinese]
This paper presents an approach to detecting and correcting edit disfluency based on conditional random fields with variable-length features. The variable-length features consist of word, chunk and sentence features. Conditional random fields (CRF) are adopted to model the properties of the edit disfluency, including repair, repetition and restart, for edit disfluency detection. For the evaluat...
Prosodic parallelism as a cue to repetition and error correction disfluency
Complex disfluencies that involve the repetition or correction of words are frequent in conversational speech, with repetition disfluencies alone accounting for over 20% of disfluencies. These disfluencies generally do not lead to comprehension errors for human listeners. We propose that the frequent occurrence of parallel prosodic features in the reparandum (REP) and alteration (ALT) intervals...
Contextual Maximum Entropy Model for Edit Disfluency Detection of Spontaneous Speech
This study describes an approach to edit disfluency detection based on maximum entropy (ME) using contextual features for rich transcription of spontaneous speech. The contextual features contain word-level, chunk-level and sentence-level features for edit disfluency modeling. Due to the problem of data sparsity, word-level features are determined according to the taxonomy of the primary featur...
Context Sensitive Query Correction Method for Query-Based Text Summarization
Contextual spell correction is very important for real-word error correction. It gives the correct word for an incorrect word in a particular sentence. A traditional spell checker can correct misspelled words that are not present in the dictionary, but here we try to develop a spell checker that gives the appropriate word on the basis of the contextual meaning of the sentence. This spell ch...
Publication date: 2009